Reading
Art2Music: Generating Music for Art Images with Multi-modal Feeling Alignment
Hong, Jiaying, Zhu, Ting, Markchom, Thanet, Liang, Huizhi
With the rise of AI-generated content (AIGC), generating perceptually natural and feeling-aligned music from multimodal inputs has become a central challenge. Existing approaches often rely on explicit emotion labels that require costly annotation, underscoring the need for more flexible feeling-aligned methods. To support multimodal music generation, we construct ArtiCaps, a pseudo feeling-aligned image-music-text dataset created by semantically matching descriptions from ArtEmis and MusicCaps. We further propose Art2Music, a lightweight cross-modal framework that synthesizes music from artistic images and user comments. In the first stage, images and text are encoded with OpenCLIP and fused using a gated residual module; the fused representation is decoded by a bidirectional LSTM into Mel-spectrograms with a frequency-weighted L1 loss to enhance high-frequency fidelity. In the second stage, a fine-tuned HiFi-GAN vocoder reconstructs high-quality audio waveforms. Experiments on ArtiCaps show clear improvements in Mel-Cepstral Distortion, Frechet Audio Distance, Log-Spectral Distance, and cosine similarity. A small LLM-based rating study further verifies consistent cross-modal feeling alignment and offers interpretable explanations of matches and mismatches across modalities. These results demonstrate improved perceptual naturalness, spectral fidelity, and semantic consistency. Art2Music also maintains robust performance with only 50k training samples, providing a scalable solution for feeling-aligned creative audio generation in interactive art, personalized soundscapes, and digital art exhibitions.
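The abstract above mentions a frequency-weighted L1 loss on Mel-spectrograms but does not give its form. The following is a minimal sketch of one way such a loss could look, assuming a linear weight ramp over Mel bins; the function name `frequency_weighted_l1`, the `max_weight` parameter, and the ramp itself are illustrative assumptions, not the authors' definition.

```python
import numpy as np

def frequency_weighted_l1(pred, target, max_weight=2.0):
    """Mean absolute error over a Mel-spectrogram with per-bin weights.

    pred, target: arrays of shape (n_mels, n_frames).
    max_weight: weight of the highest Mel bin (hypothetical parameter);
                the lowest bin always gets weight 1.0.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    n_mels = pred.shape[0]
    # Assumed weighting: linear ramp from 1.0 (lowest bin) to max_weight (highest bin).
    weights = np.linspace(1.0, max_weight, n_mels)[:, None]
    return float(np.mean(weights * np.abs(pred - target)))
```

With `max_weight=1.0` this reduces to the plain L1 loss; larger values penalize errors in the upper Mel bins more heavily.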
- Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)
- Europe > United Kingdom > England > Berkshire > Reading (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
High-Resolution Probabilistic Data-Driven Weather Modeling with a Stretched-Grid
Nordhagen, Even Marius, Haugen, Håvard Homleid, Salihi, Aram Farhad Shafiq, Ingstad, Magnus Sikora, Nipen, Thomas Nils, Seierstad, Ivar Ambjørn, Frogner, Inger-Lise, Clare, Mariana, Lang, Simon, Chantry, Matthew, Dueben, Peter, Kristiansen, Jørn
We present a probabilistic data-driven weather model capable of providing an ensemble of high spatial resolution realizations of 87 variables at arbitrary forecast length and ensemble size. The model uses a stretched grid, dedicating 2.5 km resolution to a region of interest, and 31 km resolution elsewhere. Based on a stochastic encoder-decoder architecture, the model is trained using a loss function based on the Continuous Ranked Probability Score (CRPS) evaluated point-wise in real and spectral space. The spectral loss component is shown to be necessary to create fields that are spatially coherent. The model is compared to high-resolution operational numerical weather prediction forecasts from the MetCoOp Ensemble Prediction System (MEPS), showing competitive forecasts when evaluated against observations from surface weather stations. The model produces fields that are more spatially coherent than mean squared error based models and CRPS based models without the spectral component in the loss.
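For readers unfamiliar with the CRPS named in the abstract above, here is a minimal sketch of the standard ensemble estimator, CRPS = E|X - y| - 0.5 E|X - X'|, evaluated point-wise in real space. The function name `ensemble_crps` is an assumption for illustration; the spectral-space term and any weighting used by the authors are not shown.

```python
import numpy as np

def ensemble_crps(ensemble, obs):
    """Point-wise CRPS estimate for an ensemble forecast.

    ensemble: array of shape (n_members, ...), one field per member.
    obs:      array of shape (...), the verifying field.
    Returns an array with the same shape as obs.
    """
    ens = np.asarray(ensemble, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # First term: mean absolute error of the members against the observation.
    skill = np.mean(np.abs(ens - obs), axis=0)
    # Second term: half the mean absolute difference between all member pairs.
    spread = 0.5 * np.mean(np.abs(ens[:, None] - ens[None, :]), axis=(0, 1))
    return skill - spread
```

For a single-member ensemble the spread term vanishes and the score reduces to the absolute error.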
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > United Kingdom > England > Berkshire > Reading (0.04)
- Europe > Norway > Eastern Norway > Oslo (0.04)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
FNP: Fourier Neural Processes for Arbitrary-Resolution Data Assimilation
Data assimilation is a vital component in modern global medium-range weather forecasting systems to obtain the best estimation of the atmospheric state by combining the short-term forecast and observations. Recently, AI-based data assimilation approaches have attracted increasing attention for their significant advantages over traditional techniques in terms of computational cost.
- Europe > Austria > Vienna (0.14)
- Europe > United Kingdom > England > West Midlands > Coventry (0.04)
- Asia > Singapore (0.04)
- North America > United States (0.68)
- Asia > China > Beijing > Beijing (0.04)
- Europe > United Kingdom > England > Berkshire > Reading (0.04)
- Asia > Japan (0.04)
- Energy (0.46)
- Government > Regional Government (0.46)
High-dimensional Bayesian filtering through deep density approximation
In this work, we benchmark two recently developed deep density methods for nonlinear filtering. Starting from the Fokker-Planck equation with Bayes updates, we model the filtering density of a discretely observed SDE. The two filters, the deep splitting filter and the deep BSDE filter, are both based on Feynman-Kac formulas, Euler-Maruyama discretizations and neural networks. The two methods are extended to logarithmic formulations providing sound and robust implementations in increasing state dimension. We benchmark the methods on numerous examples, comparing them to classical particle filters and ensemble Kalman filters. In the low-dimensional examples the particle filters work well, but when we scale up to a partially observed 100-dimensional Lorenz-96 model the particle-based methods fail and the logarithmic deep density method prevails. In terms of computational efficiency, the deep density methods reduce inference time by roughly two to five orders of magnitude relative to the particle-based filters.
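As context for the classical baseline named in the abstract above, the following is a minimal sketch of a bootstrap particle filter for a discretely observed scalar SDE: an Euler-Maruyama prediction followed by a Bayes update and resampling. The Ornstein-Uhlenbeck drift, the Gaussian observation model, and all parameter values are assumptions chosen for illustration; this is not the deep splitting or deep BSDE filter.

```python
import numpy as np

def euler_maruyama_step(x, drift, sigma, dt, rng):
    """One Euler-Maruyama step for dX = drift(X) dt + sigma dW."""
    noise = rng.standard_normal(x.shape)
    return x + drift(x) * dt + sigma * np.sqrt(dt) * noise

def bootstrap_filter_update(particles, y, obs_std, rng):
    """Bayes update for y ~ N(x, obs_std^2), followed by multinomial resampling."""
    # Log-weights are shifted by their maximum for numerical stability.
    log_w = -0.5 * ((y - particles) / obs_std) ** 2
    log_w -= log_w.max()
    w = np.exp(log_w)
    w /= w.sum()
    idx = rng.choice(particles.size, size=particles.size, p=w)
    return particles[idx]

# Demo with assumed values: OU drift -x, prior N(0, 1), one observation y = 1.
rng = np.random.default_rng(0)
particles = rng.standard_normal(2000)
for _ in range(10):
    particles = euler_maruyama_step(particles, lambda x: -x, sigma=0.5, dt=0.01, rng=rng)
posterior = bootstrap_filter_update(particles, y=1.0, obs_std=0.5, rng=rng)
```

The log-weight shift hints at why logarithmic formulations matter: in high dimension the unnormalized weights underflow quickly, and a few particles end up carrying almost all of the mass.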
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Berkshire > Reading (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
Flight Delay Prediction via Cross-Modality Adaptation of Large Language Models and Aircraft Trajectory Representation
Phisannupawong, Thaweerath, Damanik, Joshua Julian, Choi, Han-Lim
Flight delay prediction has become a key focus in air traffic management, as delays highlight inefficiencies that impact overall network performance. This paper presents a lightweight large language model-based multimodal flight delay prediction approach, formulated from the perspective of air traffic controllers monitoring aircraft delay after entering the terminal area. The approach integrates trajectory representations with textual aeronautical information, including flight information, weather reports, and aerodrome notices, by adapting trajectory data into the language modality to capture airspace conditions. The experiments show that the model consistently achieves sub-minute prediction error by effectively leveraging contextual information related to the sources of delay, fulfilling the operational standard for minute-level precision. The framework demonstrates that linguistic understanding, when combined with cross-modality adaptation of trajectory data, enhances delay prediction. Moreover, the approach shows practicality and potential scalability for real-world operations, supporting real-time updates that refine predictions upon receiving new operational information.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > South Korea > Incheon > Incheon (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Transportation > Passenger (1.00)
- Transportation > Air (1.00)
- Consumer Products & Services > Travel (1.00)
- Transportation > Infrastructure & Services > Airport (0.68)
Synergistic Neural Forecasting of Air Pollution with Stochastic Sampling
Abeysinghe, Yohan, Munir, Muhammad Akhtar, Baliah, Sanoojan, Sarafian, Ron, Khan, Fahad Shahbaz, Rudich, Yinon, Khan, Salman
Air pollution remains a leading global health and environmental risk, particularly in regions vulnerable to episodic air pollution spikes due to wildfires, urban haze and dust storms. Accurate forecasting of particulate matter (PM) concentrations is essential to enable timely public health warnings and interventions, yet existing models often underestimate rare but hazardous pollution events. Here, we present SynCast, a high-resolution neural forecasting model that integrates meteorological and air composition data to improve predictions of both average and extreme pollution levels. Built on a regionally adapted transformer backbone and enhanced with a diffusion-based stochastic refinement module, SynCast captures the nonlinear dynamics driving PM spikes more accurately than existing approaches. Leveraging harmonized ERA5 and CAMS datasets, our model shows substantial gains in forecasting fidelity across multiple PM variables (PM$_1$, PM$_{2.5}$, PM$_{10}$), especially under extreme conditions. We demonstrate that conventional loss functions underrepresent distributional tails (rare pollution events) and show that SynCast, guided by domain-aware objectives and extreme value theory, significantly enhances performance in highly impacted regions without compromising global accuracy. This approach provides a scalable foundation for next-generation air quality early warning systems and supports climate-health risk mitigation in vulnerable regions.
- North America > United States (0.14)
- Asia > China (0.05)
- Europe > Middle East (0.04)
Trajectory learning for ensemble forecasts via the continuous ranked probability score: a Lorenz '96 case study
Ephrati, Sagy, Woodfield, James
This paper demonstrates the feasibility of trajectory learning for ensemble forecasts by employing the continuous ranked probability score (CRPS) as a loss function. Using the two-scale Lorenz '96 system as a case study, we develop and train both additive and multiplicative stochastic parametrizations to generate ensemble predictions. Results indicate that CRPS-based trajectory learning produces parametrizations that are both accurate and sharp. The resulting parametrizations are straightforward to calibrate and outperform derivative-fitting-based parametrizations in short-term forecasts. This approach is particularly promising for data assimilation applications due to its accuracy over short lead times.
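To make the setup in the abstract above concrete, here is a minimal sketch of an ensemble forecast from a reduced Lorenz '96 model with an additive stochastic parametrization, stepped with Euler-Maruyama. The linear closure `a0 + a1 * x`, the noise amplitude `sigma`, and all numerical values are hypothetical placeholders; in the paper such parameters would be learned by minimizing the CRPS along trajectories, which is not shown here, and the multiplicative variant is omitted.

```python
import numpy as np

def l96_tendency(x, forcing=10.0):
    """Single-scale Lorenz '96 tendency, periodic along the last axis:
    dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F."""
    x_plus1 = np.roll(x, -1, axis=-1)
    x_minus1 = np.roll(x, 1, axis=-1)
    x_minus2 = np.roll(x, 2, axis=-1)
    return (x_plus1 - x_minus2) * x_minus1 - x + forcing

def ensemble_forecast(x0, n_members, n_steps, dt, a0, a1, sigma, rng):
    """Euler-Maruyama ensemble for the reduced model
    dX = [tendency(X) - (a0 + a1 X)] dt + sigma dW  (assumed additive closure)."""
    x = np.tile(np.asarray(x0, dtype=float), (n_members, 1))
    for _ in range(n_steps):
        drift = l96_tendency(x) - (a0 + a1 * x)
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x
```

All members start from the same state, so any ensemble spread comes from the stochastic term alone; setting `sigma=0` collapses the ensemble to a single deterministic trajectory.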
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Berkshire > Reading (0.04)
- Europe > Switzerland (0.04)
Blade: A Derivative-free Bayesian Inversion Method using Diffusion Priors
Zheng, Hongkai, Wang, Austin, Wu, Zihui, Huang, Zhengyu, Baptista, Ricardo, Yue, Yisong
Derivative-free Bayesian inversion is an important task in many science and engineering applications, particularly when computing the forward model derivative is computationally and practically challenging. In this paper, we introduce Blade, which can produce accurate and well-calibrated posteriors for Bayesian inversion using an ensemble of interacting particles. Blade leverages powerful data-driven priors based on diffusion models, and can handle nonlinear forward models that permit only black-box access (i.e., derivative-free). Theoretically, we establish a non-asymptotic convergence analysis to characterize the effects of forward model and prior estimation errors. Empirically, Blade achieves superior performance compared to existing derivative-free Bayesian inversion methods on various inverse problems, including challenging highly nonlinear fluid dynamics.
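To illustrate what a derivative-free update with an ensemble of interacting particles can look like, the sketch below implements one step of classical ensemble Kalman inversion (EKI) with perturbed observations. EKI is swapped in here as a well-known example of this family; it is not the Blade algorithm and uses no diffusion prior. The function name `eki_step` and the toy forward model in the demo are assumptions.

```python
import numpy as np

def eki_step(particles, forward, y, noise_cov, rng):
    """One ensemble Kalman inversion update with perturbed observations.

    particles: (J, d) ensemble of parameter vectors.
    forward:   black-box map from a (d,) vector to an (m,) vector; only
               evaluations are used, never derivatives.
    y:         (m,) observed data.
    noise_cov: (m, m) observation noise covariance.
    """
    n = particles.shape[0]
    g = np.stack([forward(u) for u in particles])      # (J, m) forward evaluations
    du = particles - particles.mean(axis=0)
    dg = g - g.mean(axis=0)
    c_ug = du.T @ dg / (n - 1)                         # (d, m) cross-covariance
    c_gg = dg.T @ dg / (n - 1)                         # (m, m) output covariance
    y_pert = y + rng.multivariate_normal(np.zeros(y.size), noise_cov, size=n)
    # Solve (C_gg + Gamma) z = residual for every particle at once.
    z = np.linalg.solve(c_gg + noise_cov, (y_pert - g).T)   # (m, J)
    return particles + (c_ug @ z).T

# Demo with an assumed toy problem: recover u from y = 2u observed as 4.
rng = np.random.default_rng(0)
ensemble = rng.standard_normal((200, 1))
for _ in range(5):
    ensemble = eki_step(ensemble, lambda u: 2.0 * u, np.array([4.0]),
                        np.array([[0.01]]), rng)
```

The particles interact only through the ensemble covariances, which stand in for the gradient information a derivative-based method would need.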
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > United Kingdom > England > Berkshire > Reading (0.04)